Use the Batches API for large-volume, asynchronous, non-time-sensitive workloads to reduce costs by 50% compared to standard Messages API calls.
The Anthropic Batches API is designed for scenarios where you need to process a large number of non-urgent requests. The primary use cases are bulk evaluation of model outputs on test datasets, generating large-scale synthetic datasets for training or fine-tuning, powering offline reporting and analytics pipelines, and performing retroactive classification or data extraction on historical records. The key trade-off is that batch results are not available immediately—processing is asynchronous and typically completes within 24 hours.
Asynchronous processing: Submit a batch of requests and retrieve results later
50% cost reduction: Batch requests are half the price of standard API calls
24-hour processing window: Results are guaranteed to be available within 24 hours (typically much faster)
Not for real-time applications: Use standard Messages API for chat, interactive assistants, or latency-sensitive tasks